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ABSTRACT 
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possible "deabbreviations" to just a few good guesses. This is done by examining 
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easy to use and results in significantly improved program comprehensibility. 
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Abbreviations of natural-language words are common in computer programs, most frequently for pro- 
cedure and variable names. Some evidence suggest that abbreviations, as opposed to the full natural- 
language words from which they are derived, adversely affect comprehensibility of a source program. 
[Schneiderman, 1980, pages 70-75] and [Weissman, 1974] showed experiments in which significant 
improvement in ability to answer questions about a program occurred when mnemonic names were used 
instead of abbreviated names in the program, and [Laitinen, 1992] showed programmers preferred to 
work with programs having natural-language- word or "natural" names. Natural names can be developed 
early in the software design cycle by a structured methodology like [Laitinen and Mukari, 1992]. But 
not all programmers will accept tliis additional imposition, and more importantly, this methodology can- 
not be used with existing programs that need maintenance. So we have explored semi-automatic 
replacement of abbreviations by natural names in existing programs and have developed a protot>q)e 
tool. This new tool for software engineering we call a "deabbreviator". Its output can either replace 
the original program or serve as an aid to its understanding, facilitating subsequent modification or 
reuse of the program. 

Some simple tools replace abbreviations according to a fixed list, including DataLift from Peoplesmith, 
Inc,, Magic Typist from Olduvai Corp., and The Complete Writer’s Toolkit from Systems Compatibility 
Corp. [Laitinen, 1992] mentions a more sophisticated tool that also checks for conflicts among replace- 
ment names. Unfortunately, abbreviations vary considerably between applications, and the general- 
purpose replacement lists of these prognims are of only limited help for the often specialized task of 
software description. For example, "Ibr" is common abbreviation for "labor" in business, but not in a 
biological laboratory. [Rosenberg, 1992] lists an average of two different interpretations for ever}’ 
three-word abbreviation, even for the restricted domain of "information technology" and the restriction 
to common interpretations. An improvement would be application-specific abbreviation pairs plus 
"deabbreviation" rules that can intelligently guess replacements not in the list. But the challenge is to 
find a way to limit the seemingly unbounded number of words that could be checked as the source of 
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an abbreviation. 

2. Data structures 

Our deabbreviation tool uses the following support files, all hashed: 

- a dictionary list of 29,000 common English words; 

- a list of reserved words for the programming language and/or operating system whose programs 
are being analyzed; 

- an auxiliary list of valid English words found in previous runs but not in the preceding files; 

- common words of the application domain, used as additional comment words (optional); 

- standard abbreviations used in computer software, with their deabbreviations: 

- replacements accepted for previous programs (optional). 

The dictionary is necessary to define the acceptable natural names in a program. To obtain it, we com- 
bined wordlists from the Berkeley Unix "spell" utility, the GNU Emacs "ispell" utility, the first author’s 
papers, some saved email, and captions from 100,000 pictures at a Navy Laboratory. We removed 
abbreviations from these sources manually, and were particularly liberal in removing words under four 
letters in length. This gave us a reasonably broad vocabulary of about 29,000 English words. This dic- 
tionary is then supplemented with reserved words specific to the programming language being used, and 
any new words confirmed by the user on previous runs. The user may also include a optional file of 
words specific to the real-world domain that the program addresses; these can duplicate the dictionary', 
but have special priority in trying to find deabbreviations. 

We also manually compiled a "standard abbreviation list" of 168 entries from a study of programs writ- 
ten by a variety of programmers, supplemented by the list of [Rosenberg, 1992]; example entries are 
"ptr -> pointer" and "term -> terminal". These are the abbreviations we observed repeatedly in a wide 
range of program examples. The list was kept short to reduce domain-dependent alternatives; and it 
was not made any shorter because explicit listing saves time with common abbreviations, although 
many entries could be derived from the abbreviation rules discussed next. The list is supplemented by 
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deabbreviations confirmed or provided by the user, as we will discuss. 

3. Deabbreviation methods 

Deabbreviating means replacing an abbreviated name with a more understandable "natural” one consist- 
ing of whole English words. To deabbreviate, we "generate-and-test": We select candidates and abbre- 
viate them various ways, trying to obtain a match with a given abbreviation. The abbreviation rules 
derive mostly from our study of example programs, with some ideas from [Bourne and Ford, 1961]. 
Three "word abbreviation" methods are used: (1) match to a standard deabbreviation, previous user 
replacement, or "analogy" (see below), as confirmed by lookup in the corresponding hash table; (2) 
truncation on its right end of some word in the program’s comments, as long as at least two letters 
remain and at least two letters are eliminated; and (3) elimination of vowels in a comment word, pro- 
vided at least two letters remain including the first. Then a "full abbreviation" is either one, two, or 
three "word abbreviations" appended together. These methods together explain about 98 percent of 
abbreviations we observ'ed in example programs, as abbreviations rarely get more complicated than 
three-word, although we do have ways to find longer abbreviations as discussed below. For an -letter 
word, word-abbreviation method (1) generates 0(1) abbreviations for matching, method (2) 0{m), and 
method (3) 0(1); so the whole algorithm generates O(m^) full abbreviations. Cubic behavior was 
observed to be the limit if processing were to be fast enough. 

A key feature of our approach is the restriction of word-abbreviation methods (2) and (3) to words in 
the comments on the source program, supplemented by optional application-domain words. With at 
29000-word dictionary, there would be 25 million million combinations to abbreviate otherwise. We 
believe that if a program is properly commented, the comments should contain most of the natural 
names that would be appropriate to use in its variable and procedure names beyond the standard pro- 
gramming words like "pointer". But we do not insist that comment words be near their abbreviations, 
as comments can have global scope and can reference backward as well as forward. Note that word- 
abbreviation methods (2) and (3) leave intact the initial letter of a word or phrase, consistent with the 
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programs we studied, and that eliminates many possibilities. 

Since a word can have many abbreviations, the nondeterminism of Prolog is valuable, as it permits 
backtracking with deptli-first search to get further deabbreviations until the user finds one acceptable. 
The above ordering of the methods is used heuristically in this depth-first search, so that direct hash 
lookups are tried before deleting characters, single-word abbreviations before two-word, and two-word 
before three-word. These heuristics derive from cost analysis of the methods. With two-word and 
three-word methods, an additional heuristic, based on study of example programs, considers rightmost 
splits first, so ’’tempvar" would be first split into ’’tempv" and ”ar", then "temp" and "var". 

4. Control structure 

The control structure of the deabbreviation tool has similarities to those of spelling correctors [Peterson, 
1980], text-error detectors [Kukich, 1992], and copy-editor assistants like [Dale, 1989]. But abbreviated 
words are farther from their sources than misspelled words are, so a deabbreviator requires more com- 
putation per word; and abbreviating loses so much information that it cannot be inverted and we must 
rely on generate-and-test. 

Processing has three passes. On the first pass, the input program is read in, comment words more than 
3 characters are stored, and non -comment words are looked up in the dictionary. Reserved words of the 
programming language, words within quotation marks, words containing numbers, and words already 
analyzed are ignored. Morphology and case considerations complicate the lookup. To simplify matters, 
our dictionar}' and standard abbreviations are kept in lower case, and comment and unknown non- 
comment words are converted to lower case for matching. Similarly, most of the dictionary is minus 
the standard English suffixes "s", "ed", and ”ing", and unknown words for comparison are stripped of 
any such endings, using the appropriate morphological rules of English for undoubling of final con- 
sanants, adding a final "e", and changing "i" to "y". The unknown word and all stripped words are tried 
separately, to catch misleading words like "string" and "fuss". 
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The second pass then individually considers the words that did not match on the first pass. Full abbre- 
viations are generated to try to match the given word. Plurals are also tried to match to the given word 
or pieces of it. Possible deabbreviations are individually displayed to the user for approval; if the user 
rejects them all, the user is asked if the word can be added to the dictionary. If not, the user must sup- 
ply a replacement, which is recursively checked to contain only dictionary words, allowing underscores 
for punctuation. Replacements can be specified either global for the entire program or local to the pro- 
cedure definition. 

We considered ranking of the alternative deabbreviations for a word during the second pass, and 
presenting the most likely to the user first. However, we concluded that this was not cost-effective. It 
would require about 10 seconds per unknown word in our current implementation. To do it properly, 
we must detenuine the frequency of every one of the 29,000 words in the dictionary, and frequencies 
would vary with domain. Abbreviation methods also vary considerably in frequency even with the 
same programmer. 

The third pass changes the names in the program as decided on the second pass. Name collisions can 
occur if user-approved deabbreviations are identical with words already used in the program, and the 
user is then given a warning, 

5. Analogies 

Replacements can also be indirectly inferred from analogies, and this is one of the most important 
features of tlie tool. For instance, if the user confirms that they want to replace "buftempfoobar" with 
"buffer__temporary_foobar", we can infer that a replacement for ‘'buf is "buffer" and for "temp" is 
"temporary", even if we do not recognize "buf’ and "foobar". Previous user replacements can be found 
within an analogy too, to simplify matters, and multiple deabbreviations of the same abbreviation must 
be permitted to be inferred (although at lesser priority than user-sanctioned deabbreviations). We 
discovered that analogies are very helpful in deabbreviating real programs because user names are often 
interrelated. Such analogies should be found after a replacement is approved by the user, not during the 
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generation of deabbreviations when there would be too many possibilities to consider. Caching of 
analogies that are rejected is important to prevent requerying tliein. 

Replacements can also be inferred from analogies between two previous replacements. For instance, if 
the user confirmed that ’’temp" should be replaced by "temporary", and then confirmed "tempo" should 
be replaced by "temporary _operator", we should infer that one possible replacement for "o" in this pro- 
gram is "operator" even though "tempo" is an English word, "tempo" is a truncation of "temporary", 
and one-word abbreviations are quite ambiguous. Then if "tempb" occurs later and "b" is known to be 
replaceable by "buffer", the deabbreviation "temporary ^buffer" will be the first guess. Analogizing can 
also infer deabbreviations more than three words long that the standard abbreviation rules cannot; for 
instance, if "tempg" is "temporar)^__global", "var" is "variable", and "nm" is "name", then "tempgvamm" 
is "temporary_global_variable__name". With such analogizing, the system can become progressively 
more able to guess what the user means as it works on a program. 

Acronyms are not recognized by the standard deabbreviation methods, so analogies are especi^dly 
important for them, to recognize them as components of abbreviations. Acronyms make poor names 
since they highly ambiguous, but they were rare in programs we studied. 

Previous user replacements (whether generated by the program or by the user) can be very helpful in 
analogies. We give the user the option of keeping the replacements (both explicit and inferred) from 
earlier programs that were deabbreviated, so that modules of related programs can be treated together. 

6. Experiments 

To test the tool, 15 C programs were deabbreviated by the second author and 7 Prolog programs were 
deabbreviated by the first author. The programs were written by a variety of programmers for a variety 
of applications. CPU time averaged about 1.5 seconds per source-program word with semi-compiled 
Quintus Prolog running on a Sun SparcStation, witli questions to the user, after initialization, coming at 
the comfortable rate of about one per every two seconds, although there was significant inter-program 
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variation. The C programs had a total of 20,990 symbols, all but 1096 distinct occurrences of which 
were a-priori acceptable (as either dictionary entries, numbers, or punctuation); of the 1096, 598 pro- 
posed deabbreviations were accepted by the user, 434 proposed deabbreviations were rejected, 274 were 
added as new words to the dictionary, 175 required explicit user replacements, and 49 short codes were 
left untouched. The Prolog programs had 5850 symbols, all but 605 distinct occurrences of which were 
a-priori acceptable; of the 605, 371 proposed deabbreviations were accepted by the user, 123 proposed 
deabbreviations were rejected, 9 were added as new words to the dictionary, 140 required explicit user 
replacements, and 85 short codes were left untouched. Thus the tool guesses acceptable replacements 
more than half the time, and appears to save users significant time in the task of replacing names in a 
program. These results also indicate that multiple possible deabbreviations are not common, and sup- 
port our decision not to rank deabbreviations of a word. 

We tested some tool output on 24 subjects, third-quarter students of a M.S. program in Computer Sci- 
ence. We used two programs written previously for other purposes, one in C of 373 symbols, and one 
in Prolog of 118. Half the subjects got the original C program and the deabbreviation of the Prolog 
program; the other half got the deabbreviation of the C program and tlie original Prolog program. We 
asked 10 multiple-choice comprehension questions in 20 minutes about the C program and 8 questions 
in 15 minutes about the Prolog program. Some questions asked about purposes ("Why are there four 
cases for median_counts7"), some about execution behavior ("What arguments to median should be 
bound before quer>'ing it?"), and some about exceptions ("What happens if the word being inserted into 
the lexicon is there already?"). Subject performance was better on both deabbreviated programs: 
7.69/10 versus 7.17/10 for the C program, and 4.64/8 vs. 3.62/8 for the Prolog program. Using Fisher’s 
2-by-2 test, we confirmed tliat output of the deabbreviator gave higher comprehensibility at significance 
level 4.8%. 

7. Conclusion 



Our experience and experiments show that a deabbreviation tool with inferencing is easy to use and can 
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definitely improve the comprehensibility of a source program. It could be educational, since it requires 
the user to carefully describe program variables and procedures. But more importantly, it could help in 
the costly task of software maintenance. 
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